Lexical tuning based on triphone confidence estimation

Authors

  • Kevin L. Markey
  • Wayne H. Ward
Abstract

We propose and test a practical means of finding poor pronunciations and missing variants for large lexicons. We do so by statistically assessing the confidence of each phone in each pronunciation and comparing it with the statistical distribution of the same confidence metric for corresponding phones over the entire training corpus. A phone is targeted for correction for each word in which its mean score is significantly less than the phone's mean score over the entire training corpus. Neighboring phones are also reviewed for their contribution to the target phone's poor score. Thus far, we have experimented with this technique by manually correcting the pronunciation. In experiments with Wall Street Journal and dictated physical examination corpora, word error rates were reduced commensurate with the number of dictionary entries whose pronunciations were corrected as a result of this process.
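
The screening step described in the abstract could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which confidence metric or significance test was used, so the one-sided z-test, the `z_threshold` and `min_count` parameters, the function name `flag_phones`, and the input tuple layout are all assumptions made for this example. The review of neighboring phones is left out.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, pstdev


def flag_phones(scores, z_threshold=-2.0, min_count=2):
    """Flag phones whose per-word mean confidence is unusually low.

    scores: iterable of (word, phone_index, phone, confidence) tuples,
            one per aligned phone occurrence in the training corpus
            (hypothetical layout chosen for this sketch).
    Returns a list of (word, phone_index, phone, z), lowest z first.
    """
    by_phone = defaultdict(list)  # every score for a phone, corpus-wide
    by_slot = defaultdict(list)   # scores for one phone slot in one word
    for word, idx, phone, conf in scores:
        by_phone[phone].append(conf)
        by_slot[(word, idx, phone)].append(conf)

    flagged = []
    for (word, idx, phone), confs in by_slot.items():
        corpus = by_phone[phone]
        sigma = pstdev(corpus)
        n = len(confs)
        # Skip slots with too few observations or no corpus spread.
        if n < min_count or sigma == 0:
            continue
        # Assumed test: z-score of the slot mean against the corpus mean.
        z = (mean(confs) - mean(corpus)) / (sigma / sqrt(n))
        if z < z_threshold:
            flagged.append((word, idx, phone, z))
    return sorted(flagged, key=lambda item: item[3])
```

Each flagged entry names a word and phone position whose pronunciation a lexicographer might then review by hand, which matches the manual correction workflow the abstract reports.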

Similar resources

A two-layer lexical tree based beam search in continuous Chinese speech recognition

In this paper, an approach to continuous speech recognition based on a two-layer lexical tree is proposed. The search network is maintained by the two-layer lexical tree, in which the first layer reflects the word net and the phone net while the second layer the dynamic programming (DP). Because the acoustic information is tied in the second layer, the memory cost is so small that it has the ab...

Voice Assimilation Phenomenon and Its Implementation in LVCSR System with Lexical Tree and Bigram Language Model

In this paper a LVCSR system with implementation of the Czech voice assimilation phenomenon is proposed. The recognition system uses lexical trees and a bigram language model. The first part of this article is focused on voice assimilation phenomenon description, triphone lexical tree construction, and voice assimilation impact on LVCSR system performance. The second part outlines lexical tree ...

An EKF-based algorithm for learning statistical hidden dynamic model parameters for phonetic recognition

This paper presents a new parameter estimation algorithm based on the Extended Kalman Filter (EKF) for the recently proposed statistical coarticulatory Hidden Dynamic Model (HDM). We show how the EKF parameter estimation algorithm unifies and simplifies the estimation of both the state and parameter vectors. Experiments based on N-best rescoring demonstrate superior performance of the (contexti...

An EM-based algorithm for learning statistical hidden dynamic model parameters for phonetic recognition

This paper presents a new parameter estimation algorithm based on the Extended Kalman Filter (EKF) for the recently proposed statistical coarticulatory Hidden Dynamic Model (HDM). We show how the EKF parameter estimation algorithm unifies and simplifies the estimation of both the state and parameter vectors. Experiments based on N-best rescoring demonstrate superior performance of the (contexti...

A Voice Dictation System for a Million-Word Czech Vocabulary

The paper describes a set of techniques developed for discrete dictation within a vocabulary that contains up to a million entries, which is one of the main challenges in highly inflected languages like Czech. We present our approach to building an efficiently coded tree lexicon with suffix sub-trees and morphologic classification. Acoustic modeling is based on either monophone, diphone, or tri...

Publication date: 1997